Alternative methods for panel combination in rFIA
logo



library(rFIA)
library(dplyr)
library(rgdal)
library(sf)
library(lubridate)
library(gridExtra)
library(data.table)

Andy and I have been rolling out some updates to rFIA since the Stakeholder’s meeting a few weeks back. Namely, we have made improvements in terms of speed and RAM utilization and have implemented a few methods as alternatives to the temporally indifferent method for panel combination. This document is intended to serve as a brief intro to our updates and highlight the potential for future work on this front.

The speed up


Using the Rhode Island subset for the years 2014-2018 we see substantial improvements in efficiency, even with a large number of grouping variables.

## Old version
start <- now()
old <- rFIA:::tpa_old(fiaRI, bySpecies = TRUE, bySizeClass = TRUE, nCores = 3)
old_time <- now() - start

## New version
start <- now()
new <- tpa(fiaRI, bySpecies = TRUE, bySizeClass = TRUE, nCores = 3)
new_time <- now() - start

paste('Change in run time:', (as.numeric(new_time) - as.numeric(old_time)) / as.numeric(new_time) * 100, '%')
## [1] "Change in run time: -447.229885107249 %"

The advancements really can’t be appreciated until we scale up the size of the input data. Using the entire US database, we can return biomass estimates (8 variables and sampling errors), grouped by inventory year and ecoregion subsection, in under 18 minutes. (Note: this was using 19 cores on a Linux server, and we saw a maximum of 90 Gb of RAM utilization during processing).

New methods


We’ve focused our efforts on implementing methods identified in the Green Book as potential alternatives to the temporally indifferent method. Specifically, we now offer the following methods: annual (no panel combination), simple moving average, linear moving average, exponential moving average, and of course, temporally indifferent.

We wanted to make it easy for users to implement these alternative methods, so we simply added a method argument to all estimator functions. For the exponential moving average, users may also change the decay parameter that controls panel weighting using the lamda argument (more on this later).

Using Michigan ash as a case study, we can see quite a bit of variation in temporal lag bias between the methods:

## Get Michigan
mi <- readFIA('C:/Users/hstan/Dropbox/rfia docs/MI')

## Michigan LP shapefile
shp <- readOGR('C:/Users/hstan/Dropbox/rfia docs/EMS paper/mi_counties', 'mi_counties')
## OGR data source with driver: ESRI Shapefile 
## Source: "C:\Users\hstan\Dropbox\rfia docs\EMS paper\mi_counties", layer: "mi_counties"
## with 68 features
## It has 15 fields
shp <- as(shp, 'sf')

## White, black, green ash
ash <- c(541, 543, 544)

## Trying out a few new estimators for MI ash
ti <- biomass(mi, polys = shp, method = 'TI',
              treeDomain = SPCD %in% ash & DIA >=5, nCores = 3, 
              returnSpatial = TRUE)

an <- biomass(mi, polys = shp, method = 'annual',
              treeDomain = SPCD %in% ash & DIA >=5, nCores = 3, 
              returnSpatial = TRUE)

sma <- biomass(mi, polys = shp, method = 'sma',
               treeDomain = SPCD %in% ash & DIA >=5, nCores = 3, 
               returnSpatial = TRUE)

lma <- biomass(mi, polys = shp, method = 'lma',
              treeDomain = SPCD %in% ash & DIA >=5, nCores = 3, 
              returnSpatial = TRUE)

ema <- biomass(mi, polys = shp, method = 'ema',
               treeDomain = SPCD %in% ash & DIA >=5, nCores = 3, 
               returnSpatial = TRUE)

tip <- filter(ti, YEAR %in% c(2003, 2007,  2011, 2015))
anp <- filter(an, YEAR %in% c(2003, 2007,  2011, 2015))
smap <- filter(sma, YEAR %in% c(2003, 2007,  2011, 2015))
lmap <- filter(lma, YEAR %in% c(2003, 2007,  2011, 2015))
emap <- filter(ema, YEAR %in% c(2003, 2007,  2011, 2015))

allp <- bind_rows(tip, anp, smap, lmap, emap)

#tiPlot <- 
  plotFIA(tip, BIO_AG_ACRE, facet = TRUE, line.width = .1,
        plot.title = 'Temporally Indifferent', limits = c(0, 35),
        transform = 'sqrt', legend.title = 'AG Biomass per Acre', legend.height = .7)

#anPlot <- 
  plotFIA(anp, BIO_AG_ACRE, facet = TRUE, line.width = .1,
                  plot.title = 'Annual', limits = c(0, 35),
                  transform = 'sqrt', legend.title = 'AG Biomass per Acre', legend.height = .7)

#smPlot <- 
  plotFIA(anp, BIO_AG_ACRE, facet = TRUE, line.width = .1,
                  plot.title = 'Simple MA', limits = c(0, 35),
                  transform = 'sqrt', legend.title = 'AG Biomass per Acre', legend.height = .7)

#lmPlot <- 
  plotFIA(anp, BIO_AG_ACRE, facet = TRUE, line.width = .1,
                  plot.title = 'Linear MA', limits = c(0, 35),
                  transform = 'sqrt', legend.title = 'AG Biomass per Acre', legend.height = .7)

#emPlot <- 
  plotFIA(anp, BIO_AG_ACRE, facet = TRUE, line.width = .1,
                  plot.title = 'Exponential MA', limits = c(0, 35),
                  transform = 'sqrt', legend.title = 'AG Biomass per Acre', legend.height = .7)

#grid.arrange(tiPlot, anPlot, smPlot, lmPlot, emPlot)

The exponential moving average


The exponential moving average differs from the simple and linear moving averages in that it allows the user to modify the weighting scheme using a decay parameter, lambda, that varies between 0 and 1. Low lambda values will place higher weight on more recent observations, so as lambda approaches 0, the exponential moving average will approach the annual method for panel combination.

Here we show the effect of varying lambda estimates of Michigan ash populations over time:

## Varying the decay parameter for the EMA
ema_lambda <- biomass(mi, method = 'ema',
                      treeDomain = SPCD %in% ash & DIA >=5, nCores = 3, 
                      lambda = seq(.35, .95, .05))

## MA ribbons
plotFIA(ema_lambda, BIO_AG_ACRE, grp = lambda, x.lab = "", 
        y.lab = 'AG Biomass (tons/acre)', legend.title = 'Decay Parameter',
        plot.title = "Exponential Moving Average")

plotFIA(ema_lambda, BIO_AG_ACRE_SE, grp = lambda, x.lab = "", 
        y.lab = 'Sampling Error (%)', legend.title = 'Decay Parameter',
        plot.title = "Exponential Moving Average")

US Runs


As I mentioned above, we’ve done some preliminary runs with the full US database (all reporting years). Below is a gifs that show changes in aboveground biomass over time estimating using the simple moving average. Moving forward, I think it would be exciting to implement some predictive methods to fill in gaps in reporting/measurement years across space.

bioEco_sma <- fread('C:/Users/hstan/Dropbox/usMA/bioEco_sma.csv')

eco <- readOGR('C:/Users/hstan/Dropbox/usMA/S_USA.EcomapSubsections/S_USA.EcomapSubsections', 'S_USA.EcomapSubsections')
## OGR data source with driver: ESRI Shapefile 
## Source: "C:\Users\hstan\Dropbox\usMA\S_USA.EcomapSubsections\S_USA.EcomapSubsections", layer: "S_USA.EcomapSubsections"
## with 1233 features
## It has 14 fields
## Integer64 fields read as strings:  SUBSECTION SUBSECTI_1
eco <- as(eco, 'sf')
bioEco_sma_sf <- left_join(eco, bioEco_sma, by = c('MAP_UNIT_S' = 'ECOSUBCD'))

plotFIA(bioEco_sma_sf, BIO_AG_ACRE, animate = TRUE, legend.title = 'AG Biomass per Acre', legend.height = .6, 
        line.width = .1)
## nframes and fps adjusted to match transition

 




A work by Hunter Stanke

stankehu@msu.edu